Abstract: This paper answers a question raised by Doyle
on the relevance of the Witsenhausen counterexample as a toy 
decentralized control problem. The question has two sides, the 
first of which focuses on the lack of an external channel in the 
counterexample. Using existing results, we argue that the core 
difficulty in the counterexample is retained even in the presence of 
such a channel. The second side questions the LQG formulation 
of the counterexample. We consider alternative formulations and 
show that the understanding developed for the LQG case guides 
the investigation for these other cases as well. Specifically, we 
consider 1) a variation on the original counterexample with 
general, but bounded, noise distributions, and 2) an adversarial 
extension with bounded disturbance and quadratic costs. For 
each of these formulations, we show that quantization-based 
nonlinear strategies outperform linear strategies by an arbitrarily 
large factor. Further, these nonlinear strategies also perform 
within a constant factor of the optimal, uniformly over all possible 
parameter choices (for fixed noise distributions in the Bayesian 
case). 

Fortuitously, the assumption of bounded noise results in a 
significant simplification of proofs as compared to those for the 
LQG formulation. Therefore, the results in this paper are also 
of pedagogical interest. 

I. Introduction 

Recently, we provided the first provably approximately 
optimal solution to the Witsenhausen counterexample and its 
vector extensions [1], [2]. The solutions are obtained using 
techniques from information theory that help us understand 
the implicit communication between the controllers: the ability 
of one controller to 'talk' to the other by making changes to 
the state of the system. The counterexample was discussed 
quite a bit in the symposium on 'Paths ahead in the science 
of information and decision systems' held in November 2009 
at MIT LIDS in honor of Prof. Sanjoy Mitter. The ensuing 
discussions led Prof. John Doyle to question the relevance of 
the counterexample as a toy problem in decentralized control. 
The goal of this paper is to convince the reader of that 
relevance. 

It is hard to define what constitutes a useful and relevant 
toy problem. In order to obtain a better understanding of what 
such a problem could be, it is useful to look at the following
toy problem from the neighboring field of information
theory: communicating a source across a power-constrained 
AWGN channel to minimize the average quadratic distortion 
in reconstructing the source. The problem is a toy because 
it caricatures the real world in three ways: communication 
problems today are never just point-to-point links, noise is 
rarely Gaussian, and a quadratic distortion cost is not the 
perceptually 'correct' cost criterion for most sources [3]. Even 
though these assumptions make it a toy problem, it is a useful
toy: it distills the problem of transmitting a source across a
channel — an aspect that is inherent in all practical problems 
— in a minimalist fashion. A solution to this AWGN problem 
provided the foundation for system architectures (e.g. separation
of source and channel coding) and coding techniques for
larger communication problems (e.g. see [4]) including those 
with multiple transmitters and receivers, multiple antennas, 
non-Gaussian noise, etc. 

With this understanding, is Witsenhausen's counterexample 
a relevant toy? Similar to the point-to-point communication 
problem, the counterexample distills the possibility of implicit 
communication that appears to be ubiquitous in decentralized 
control systems. Why, then, may it not be relevant? Doyle's 
first argument rests on the work of Rotkowitz and Lall [5], 
which shows that with extremely fast, infinite-capacity, and 
perfectly reliable external channels, the optimal controllers are 
linear not just for the Witsenhausen counterexample (which 
is a simple observation), but for more general problems as 
well. Given that using an external channel is often a valid 
engineering option in decentralized control problems, Doyle 
argued that Witsenhausen's counterexample may be artificially 
hard because it does not allow the controllers to talk over 
an external channel, and instead forces the controllers to 
talk implicitly through the plant. The toy may be irrelevant: 
the architectural freedom of installing an external channel 
seemingly obviates any need for implicit communication. 

In practice, however, an external channel never has infinite 
capacity or perfect reliability, which is what motivates a 
growing body of the control theory literature (for example
[6], [7]) that addresses the issue of control over noisy
and finite-capacity communication channels. In the presence 
of an imperfect external channel connecting the two controllers 
in Witsenhausen's counterexample, Martins [8] shows that 
while finding optimal solutions continues to be hard, one can 
design signaling-based nonlinear strategies guided by those 
developed for the original counterexample. Martins also shows 
that in some cases, nonlinear strategies that do not even 
use the external channel can outperform linear strategies (see footnote 1).
Provisioning for a very high SNR external channel, which 
has its own installation and operating costs, may therefore 
be unnecessary as long as nonlinear control techniques are 

1 A similar problem is considered by Shoarinejad et al. in [9], where
noisy side information of the source is available at the receiver. Since the
channel in the formulation of [9] is even more constrained than that in [8], and
nonlinear strategies outperform linear even without using the external channel 
for Martins's problem, they outperform linear for Shoarinejad's problem as 
well. 



used. In a companion paper [10], we consider this problem 
in greater detail and show that signaling-based nonlinear 
strategies can outperform linear ones by an arbitrarily large 
factor for any chosen finite-capacity external channel. We also 
derive approximately-optimal strategies which do make use 
of the external channel (see footnote 2), but even these results build on an
understanding of the original counterexample, justifying its 
relevance as a toy problem. 

Doyle's second argument is about the relevance of the LQG 
framework in Witsenhausen's counterexample. Linearity is 
fine, but do we believe that primitive random variables are 
Gaussian? Or that the designer is wedded to quadratic costs? 
The answer is no! Primitive random variables are almost never 
Gaussian, and the cost function is chosen more freely by the 
designer — the quadratic case is only one amongst many 
possible formalizations of the intuition that the cost increases 
at an increasing rate. As suggested by Doyle, of interest here 
is the work of Rotkowitz [14]. Rotkowitz shows that for 
the adversarial $\ell_2$-induced norm, as opposed to the original
expected quadratic cost in Witsenhausen's formulation, linear 
control laws are optimal and easy to find. At the same time, 
noise and initial state realizations can be completely arbitrary. 
Doyle's implicit argument, based on Rotkowitz's observation, 
is that because there is nothing sacred about the choice of 
a norm, viewed through the lens of a different (i.e. induced)
norm (and with fewer assumptions), Witsenhausen's problem
does not require implicit communication (see footnote 3). Indeed, with an
induced norm, the problem seems no more intriguing than 
other team-theoretic problems with two controllers. 

The rest of this paper addresses this second argument. 
The induced-norm takes a frequentist's approach and further
assumes that nothing is known about the state and noise values:
they can be completely arbitrary. The control strategy is
therefore paranoid, and budgets for all possible values of 
state and noise, fearing for the worst. In the cost function, 
this is reflected as a maximization over the state and noise 
values. Because no assumptions are made on how large the 
noise and state values can be, maximization of an unnormalized
quadratic cost would diverge to infinity for any control
scheme. To prevent this, the maximization is performed over 
a quadratic function of state and noise that is normalized 
with the size (a quadratic sum) of state and noise realizations. 
This "gain-perspective" is commonly adopted in understanding 
input-output stability [16, Pg. 430] of nonlinear systems. It 
characterizes how the norm of a signal changes as it passes 
through a system. 

In practical engineering contexts, however, one often knows 

2 As is suggested by what David Tse calls the "deterministic perspective"
(along the lines of [11]-[13]), linear strategies do not make good use of the
external channel because they only communicate the "most significant bits",
which can be estimated reliably at the second controller anyway. So if the
uncertainty in the initial state is large, the external channel is only of limited 
help and there remains a substantial advantage in having the controllers also 
talk through the plant. 

3 It does not appear that this was Rotkowitz's original motivation. He 
was motivated because the idea was surprising enough that no one believed 
him [15]. 



the "typical" values of state perturbations and noise realizations.
The normalization in the gain perspective then does not
reflect the actual costs incurred by the system. For instance, 
when the state perturbations and observation noises are small, 
the state estimates are more reliable, and therefore the control 
costs are often smaller. While a plain quadratic cost criterion 
reflects these smaller costs, the gain-perspective of the induced-norm
approach does not.

In order to demonstrate our point, we look at the counterexample
from both Bayesian and frequentist perspectives.
To model the knowledge of "typical" values of primitive 
random variables, we assume merely that the noise is bounded, 
and this bound is known. Our Bayesian model (Section III)
is inspired from uniformly distributed noise. It considers an 
average quadratic cost assuming further that the distribution 
of the initial state is Gaussian, but departs from the LQG 
model in that the distribution of noise is bounded and known.
Our frequentist model (Section IV) goes a step further and
considers a worst-case unnormalized quadratic cost assuming
there is no prior distribution on the state and the noise. Yet, for 
both of these formulations, implicit communication cannot be
ignored. Quantization-based implicit-communication strategies
can outperform linear strategies by an arbitrarily large factor (see footnote 4),
and these strategies also attain within a constant factor of 
the optimal cost. In the Bayesian case, the constant factor is 
reasonably small for uniform noise, as it was for the Gaussian 
case in [1], [2], but it can be large for other distributions. 
When it is large, improved implicit-communication strategies 
will be needed in order to attain within a small constant factor. 

Fortuitously, the proofs for bounded noise formulations 
considered in this paper are substantially simpler than those 
for the LQG formulation — a finite-length analysis in the style 
of [2] is not needed to show approximate optimality (see footnote 5).

II. Notation and problem statement

Fig. 1. Block-diagram for the vector Witsenhausen counterexample [1].

Vectors are denoted in bold, with the superscript to denote
their length (e.g. $\mathbf{x}^m$ is a vector of length $m$). Upper case is
used for random variables or random vectors (except when
denoting power $P$), while lower case symbols represent their
realizations. Hats ($\hat{\cdot}$) on the top of random variables denote
the estimates of the random variables. The block-diagram for
the formulations considered in this paper is shown in Fig. 1.

A control strategy is denoted by $\gamma = (\gamma_1, \gamma_2)$, where $\gamma_i$
is the function that maps the observation $\mathbf{y}_i^m$ at $C_i$ to the
4 These results are based on similar results by Mitter and Sahai [17] for the 
original counterexample. 

5 Even though a finite-length analysis is needed to obtain tighter bounds on 
the associated constant factors. 



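control input $\mathbf{u}_i^m$. The observations are given by $\mathbf{y}_1^m = \mathbf{x}_0^m$
and $\mathbf{y}_2^m = \mathbf{x}_1^m + \mathbf{z}^m$, where $\mathbf{z}^m$ is the disturbance, or the
noise at the input of the second controller. For both
formulations, the total cost is a quadratic function of the state
and the input given by:

$$J^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m) = \frac{1}{m} k^2 \|\mathbf{u}_1^m\|^2 + \frac{1}{m} \|\mathbf{x}_2^m\|^2, \qquad (1)$$

where $\mathbf{u}_1^m = \gamma_1(\mathbf{x}_0^m)$ and $\mathbf{x}_2^m = \mathbf{x}_0^m + \gamma_1(\mathbf{x}_0^m) - \mathbf{u}_2^m$, with
$\mathbf{u}_2^m = \gamma_2(\mathbf{x}_0^m + \gamma_1(\mathbf{x}_0^m) + \mathbf{z}^m)$. The cost expression includes a division
by the vector-length $m$ to allow for natural comparisons
between different vector-lengths.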
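
As a concrete illustration of this cost, the following Python sketch evaluates it for one realization of the initial state and the noise. It assumes that the total cost is $k^2$ times the power of the first input plus the power of the final state, each divided by $m$, in line with the definitions of $P$ and MMSE in Section II-A. The function name `total_cost` and the list-based vector representation are our own illustrative choices.

```python
from typing import Callable, List

Vector = List[float]
Controller = Callable[[Vector], Vector]


def total_cost(x0: Vector, z: Vector, gamma1: Controller,
               gamma2: Controller, k: float) -> float:
    """Cost for one realization (x0, z) under the strategy (gamma1, gamma2).

    Assumed form: (k^2 * ||u1||^2 + ||x2||^2) / m.
    """
    m = len(x0)
    u1 = gamma1(x0)                            # first controller observes x0
    x1 = [a + b for a, b in zip(x0, u1)]       # state after the first input
    y2 = [a + b for a, b in zip(x1, z)]        # noisy observation at the second controller
    u2 = gamma2(y2)                            # second input, acting as an estimate of x1
    x2 = [a - b for a, b in zip(x1, u2)]       # final state
    first_stage = k ** 2 * sum(u * u for u in u1)
    second_stage = sum(x * x for x in x2)
    return (first_stage + second_stage) / m
```

For instance, with a zero-forcing first controller (`gamma1` returns the negated state) and a second controller that always outputs zero, the cost reduces to $k^2 \|\mathbf{x}_0^m\|^2 / m$.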

We now provide the two problem formulations that are 
addressed in this paper. 

A. Bayesian approach: a stochastic formulation 

The initial state $\mathbf{X}_0^m$ is Gaussian, distributed $\mathcal{N}(0, \sigma_0^2 I_m)$,
where $I_m$ is the identity matrix of size $m \times m$. The observation
noise $\mathbf{Z}^m$ is distributed iid according to distribution $f_Z(z)$
with finite differential entropy $h(Z)$, finite variance $\sigma_z^2$, and
bounded support contained in $(-a, a)$. Without loss of generality,
we assume that $\sigma_z^2 = 1$. For example, for a uniformly
distributed $Z$, $\sigma_z^2 = 1$ for $a = \sqrt{3}$.

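The control objective is to minimize the expected quadratic
cost $\mathbb{E}\left[J^{(\gamma)}(\mathbf{X}_0^m, \mathbf{Z}^m)\right]$
over the choice of $\gamma$. The cost is averaged over the random
realizations of $\mathbf{X}_0^m$ and $\mathbf{Z}^m$. We use the variable $P :=
\frac{1}{m}\mathbb{E}\left[\|\mathbf{U}_1^m\|^2\right]$ to denote the power of the input $\mathbf{u}_1^m$, and
minimum mean-square error $\mathrm{MMSE} = \frac{1}{m}\mathbb{E}\left[\|\mathbf{X}_2^m\|^2\right] =
\frac{1}{m}\mathbb{E}\left[\|\mathbf{X}_1^m - \mathbf{U}_2^m\|^2\right]$ to denote the second stage cost.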

B. Frequentist approach: an adversarial formulation with 
quadratic cost 

The block-diagram is the same as that for the stochastic 
problem. The total cost is still the same function given by (1);
however, the cost for a strategy $\gamma$ is given by the maximum
cost under the constraint (see footnote 6) that $|z_i| < \sqrt{3}$ for all $i$. That is,

$$J_{\mathrm{freq}}^{(\gamma)} = \sup_{\mathbf{x}_0^m,\ \|\mathbf{z}^m\|_\infty < \sqrt{3}} J^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m).$$

III. Stochastic models for state and noise
A. Upper bound on costs 

Theorem 1: An upper bound on the optimal average costs
for the stochastic problem of Section II-A is given by

$$J_{\mathrm{opt}} \le \min\left\{ k^2 a^2,\ \frac{\sigma_0^2}{\sigma_0^2 + 1},\ k^2 \sigma_0^2 \right\}.$$

Proof: We consider the following three strategies: 1) a
scalar quantization strategy that quantizes the entire real line
using uniform quantization-bins of size $2a$ in each dimension,
2) the zero-input strategy, followed by LLSE estimation at the
second controller, and 3) the zero-forcing strategy. For a given
$(k, \sigma_0)$-pair, the strategy with minimum cost is chosen.

6 The bound of $\sqrt{3}$ is so chosen because it simplifies the derivations of
upper and lower bounds. 



For the quantization strategy, the input forces the state to
the nearest quantization point. The magnitude of the input is
therefore bounded by $a$. Since the bins are disjoint, there are
never any errors at the second controller (because the noise
is smaller than $a$). The total cost is therefore upper bounded
by $k^2 a^2$. For the zero-input strategy with Linear Least-Square
Estimation (LLSE), the cost is the same as that in the Gaussian
case zero-input strategy of [1], namely $\frac{\sigma_0^2}{\sigma_0^2 + 1}$ (because MMSE and
LLSE operations are the same in the Gaussian formulation,
and LLSE error depends on the distribution only through the
variance of the random variable). For zero-forcing, the first input
forces the state to zero, and thus the cost is $k^2 \sigma_0^2$. This completes
the proof. ■
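
The quantization argument in this proof can be checked numerically. The sketch below is a scalar Monte Carlo simulation under assumptions of our own: the noise is uniform on $(-a, a)$, the quantization points sit at integer multiples of $2a$, and the zero-input cost is taken to be the unit-noise-variance LLSE error $\sigma_0^2/(\sigma_0^2+1)$. The names `quantize`, `simulate_quantization` and `upper_bound` are hypothetical helpers, not part of any library.

```python
import math
import random


def quantize(x, a):
    """Nearest point of the grid {2*a*j : j integer}, i.e. bins of width 2a."""
    return 2.0 * a * round(x / (2.0 * a))


def simulate_quantization(k, sigma0, a, trials=10000, seed=0):
    """Average cost and worst second-stage error of the scalar quantization strategy."""
    rng = random.Random(seed)
    total = 0.0
    worst_error = 0.0
    for _ in range(trials):
        x0 = rng.gauss(0.0, sigma0)   # Gaussian initial state
        x1 = quantize(x0, a)          # first controller forces the state onto the grid
        u1 = x1 - x0                  # input magnitude is at most a
        z = rng.uniform(-a, a)        # bounded noise (uniform is an illustrative choice)
        u2 = quantize(x1 + z, a)      # second controller decodes the grid point
        x2 = x1 - u2                  # zero whenever decoding is correct
        total += k ** 2 * u1 ** 2 + x2 ** 2
        worst_error = max(worst_error, abs(x2))
    return total / trials, worst_error


def upper_bound(k, sigma0, a):
    """Minimum over the quantization, zero-input and zero-forcing costs."""
    return min(k ** 2 * a ** 2,
               sigma0 ** 2 / (sigma0 ** 2 + 1.0),
               k ** 2 * sigma0 ** 2)
```

In such a run the second controller should never make a decoding error, so the simulated cost consists of the first-stage term only and stays below $k^2 a^2$.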

B. A lower bound on the costs 

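Theorem 2: A lower bound on the costs for the stochastic
problem of Section II-A with observation noise $Z$ of variance
1 and differential entropy $h(Z)$ is given by

$$J_{\mathrm{opt}} \ge \inf_{P \ge 0}\ k^2 P + \left( \left( \sqrt{\kappa(P)} - \sqrt{P} \right)^+ \right)^2,$$

where $(x)^+ := \max\{x, 0\}$ and

$$\kappa(P) = \frac{\sigma_0^2\, 2^{2h(Z)}}{2\pi e \left( (\sigma_0 + \sqrt{P})^2 + 1 \right)}.$$

Proof: The proof follows the lines of the proof of
Theorem 3 in [1]. For a fixed $P := \frac{1}{m}\mathbb{E}\left[\|\mathbf{U}_1^m\|^2\right]$, we first
obtain a lower bound on the MMSE. We need the following
lemma [1, Lemma 3].

Lemma 1: For any three random vectors $A$, $B$ and $C$,

$$\sqrt{\mathbb{E}\left[\|B - C\|^2\right]} \ge \sqrt{\mathbb{E}\left[\|A - C\|^2\right]} - \sqrt{\mathbb{E}\left[\|A - B\|^2\right]}.$$

Proof: See [1]. ■

Substituting $\mathbf{X}_0^m$ for $A$, $\mathbf{X}_1^m$ for $B$, and $\mathbf{U}_2^m$ for $C$ in Lemma 1,

$$\sqrt{\mathbb{E}\left[\|\mathbf{X}_1^m - \mathbf{U}_2^m\|^2\right]} \ge \sqrt{\mathbb{E}\left[\|\mathbf{X}_0^m - \mathbf{U}_2^m\|^2\right]} - \sqrt{\mathbb{E}\left[\|\mathbf{X}_0^m - \mathbf{X}_1^m\|^2\right]}. \qquad (7)$$

We wish to lower bound $\mathbb{E}\left[\|\mathbf{X}_1^m - \mathbf{U}_2^m\|^2\right]$. The second term
on the RHS is no larger than $\sqrt{mP}$. Therefore, it suffices to
lower bound the first term on the RHS of (7). If we interpret
$\mathbf{U}_2^m$ as an estimate for $\mathbf{X}_0^m$, this term represents the MMSE
in reconstruction of $\mathbf{X}_0^m$ across the $X_1 - Y_2$ channel.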

Lemma 2: The mutual information across the $X_1 - Y_2$ channel is
bounded as follows

$$I(\mathbf{X}_1^m; \mathbf{Y}_2^m) \le \frac{m}{2} \log_2\left( \frac{2\pi e \left( (\sigma_0 + \sqrt{P})^2 + 1 \right)}{2^{2h(Z)}} \right).$$

Proof: See Appendix I. ■

We can now obtain a lower bound on the MMSE in reconstructing
$\mathbf{X}_0^m$ as follows: $\mathbf{X}_0^m$ is a Gaussian source that is
reconstructed across a channel of mutual information (and
hence also the capacity) upper bounded by the expression
in (15). The MMSE in reconstructing $\mathbf{X}_0^m$ is therefore lower
bounded by $m D_{\sigma_0^2}(C_{X_1 - Y_2})$ where $D_{\sigma_0^2}(R) := \sigma_0^2 2^{-2R}$ is the
distortion-rate function [18, Ch. 13] of a Gaussian source, and
$C_{X_1 - Y_2}$ is the capacity across the $X_1 - Y_2$ channel.
Thus, the MMSE in reconstructing $\mathbf{X}_0^m$ is lower bounded as

$$\frac{1}{m}\mathbb{E}\left[\|\mathbf{X}_0^m - \mathbf{U}_2^m\|^2\right] \ge \frac{\sigma_0^2\, 2^{2h(Z)}}{2\pi e \left( (\sigma_0 + \sqrt{P})^2 + 1 \right)} = \kappa(P). \qquad (9)$$

A lower bound on the MMSE follows from (7) and (9).
The theorem follows from minimizing the sum of $k^2 P$ and
MMSE over non-negative values of $P$. ■
Observe that the proof does not make use of the bounded 
nature of the noise. The theorem is thus applicable to Gaussian 
noise as well, and is therefore a generalization of the lower 
bound in [1]. 
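
The lower bound can be evaluated numerically. The sketch below assumes the bound takes the form $\inf_{P \ge 0} k^2 P + ((\sqrt{\kappa(P)} - \sqrt{P})^+)^2$ with $\kappa(P) = \sigma_0^2 2^{2h(Z)} / (2\pi e((\sigma_0 + \sqrt{P})^2 + 1))$, which is what the distortion-rate argument and the bound (15) of Appendix I give. The infimum is approximated by a grid search over $s = \sqrt{P}$, so the returned value is never smaller than the true infimum. The function names and the grid size are our own choices.

```python
import math


def kappa(P, sigma0, h_z):
    """Assumed lower bound on the per-dimension error in estimating X0 at power P."""
    numerator = sigma0 ** 2 * 2.0 ** (2.0 * h_z)
    denominator = 2.0 * math.pi * math.e * ((sigma0 + math.sqrt(P)) ** 2 + 1.0)
    return numerator / denominator


def lower_bound(k, sigma0, h_z, grid=2000):
    """Grid approximation of inf_P k^2 P + ((sqrt(kappa(P)) - sqrt(P))^+)^2."""
    # kappa is decreasing in P, so beyond s = sqrt(kappa(0)) the second
    # term is zero and the cost only grows; it suffices to search below that.
    s_max = math.sqrt(kappa(0.0, sigma0, h_z))
    best = float("inf")
    for i in range(grid + 1):
        s = s_max * i / grid
        P = s * s
        gap = max(math.sqrt(kappa(P, sigma0, h_z)) - s, 0.0)
        best = min(best, k ** 2 * P + gap ** 2)
    return best


# Differential entropy (in bits) of the uniform distribution on (-sqrt(3), sqrt(3)).
H_UNIFORM = math.log2(2.0 * math.sqrt(3.0))
```

For example, `lower_bound(1.0, 1.0, H_UNIFORM)` gives the bound for $k = \sigma_0 = 1$ with unit-variance uniform noise. Replacing `H_UNIFORM` by $\frac{1}{2}\log_2(2\pi e)$ gives the corresponding bound for Gaussian noise.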

C. Quantization-based strategies are approximately optimal 

We now show that the upper bound in Theorem 1 is within
a constant factor of the lower bound in Theorem 2.

Theorem 3: For the problem as stated in Section II-A,
the ratio $\rho$ of the upper bound in Theorem 1 to the lower
bound in Theorem 2 satisfies

$$\rho \le \frac{200 a^2}{2^{2h(Z)}},$$

and the upper bound is achieved by
quantization-based strategies, complemented by linear strategies.
For example, for $Z \sim U(-\sqrt{3}, \sqrt{3})$, the uniform distribution
of variance 1, $\rho \le 50$.

Proof: The proof is along the lines of the proof of Theorem
1 of [1]. We use $P^*$ to denote the optimizing value of $P$ in
the lower bound. We consider two cases:

Case 1: $\sigma_0^2 \le 1$.

If $P^* >$ [...], using the zero-forcing strategy, we have an
upper bound of $k^2 \sigma_0^2$. The lower bound is larger than $k^2 P^*$,
which in this case is larger than [...]. The ratio is thus
smaller than [...].

If $P^* \le$ [...], then [...],
where (a) follows from the fact that $h(Z) \le \frac{1}{2}\log_2(2\pi e)$, the
differential entropy for the $\mathcal{N}(0, 1)$ random variable (the Gaussian
distribution maximizes the differential entropy for a given
variance). Thus, [...],
which is also a lower bound on the total cost. Using the
zero-input upper bound, which is smaller than $\sigma_0^2$, the ratio in this case is
upper bounded by $\max\{[...]\}$.

Case 2: $\sigma_0^2 > 1$.

If $P^* >$ [...], using the upper bound of $k^2 a^2$, the ratio of
upper and lower bounds is smaller than [...].

If $P^* \le$ [...], then [...]
(again, because the Gaussian distribution
maximizes the differential entropy for given variance),
where (b) holds because the expression in the RHS of (a) is
an increasing function of $\sigma_0$. Thus, the following lower bound
holds for the MMSE error:

MMSE $\ge$ [...]

Using the zero-input upper bound, the ratio is smaller than
[...]. The ratio in this case is therefore smaller than

$$\max\left\{ \frac{200 a^2}{2^{2h(Z)}},\ \frac{145}{2^{2h(Z)}} \right\}.$$

The result now follows from
the observation that $a > 1$, since the variance of $Z$ is 1. ■
Note that the result is not asymptotic — the constant factor is 
uniform over all vector lengths, though it can be improved 
using lattice-based strategies of [2] for upper bound, and 
sphere-packing bounds [2] for lower bound. 
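
The constant for uniform noise can be checked with a short numerical sketch. It assumes the upper bound $\min\{k^2 a^2, \sigma_0^2/(\sigma_0^2+1), k^2\sigma_0^2\}$, the lower bound $\inf_P k^2 P + ((\sqrt{\kappa(P)} - \sqrt{P})^+)^2$ with $\kappa(P) = \sigma_0^2 2^{2h(Z)}/(2\pi e((\sigma_0+\sqrt{P})^2+1))$, and the constant $200a^2/2^{2h(Z)}$ quoted in the Remark that follows. The grid of $(k, \sigma_0)$ values is an arbitrary choice of ours.

```python
import math

A = math.sqrt(3.0)             # support bound of the unit-variance uniform noise
H_Z = math.log2(2.0 * A)       # its differential entropy in bits


def rho_bound(a, h_z):
    """Constant factor 200 a^2 / 2^{2 h(Z)}."""
    return 200.0 * a ** 2 / 2.0 ** (2.0 * h_z)


def upper_bound(k, sigma0, a=A):
    return min(k ** 2 * a ** 2, sigma0 ** 2 / (sigma0 ** 2 + 1.0), k ** 2 * sigma0 ** 2)


def kappa(s, sigma0, h_z=H_Z):
    """kappa as a function of s = sqrt(P)."""
    return sigma0 ** 2 * 2.0 ** (2.0 * h_z) / (
        2.0 * math.pi * math.e * ((sigma0 + s) ** 2 + 1.0))


def lower_bound(k, sigma0, h_z=H_Z, grid=2000):
    """Grid search over s = sqrt(P); never smaller than the true infimum."""
    s_max = math.sqrt(kappa(0.0, sigma0, h_z))
    best = float("inf")
    for i in range(grid + 1):
        s = s_max * i / grid
        gap = max(math.sqrt(kappa(s, sigma0, h_z)) - s, 0.0)
        best = min(best, k ** 2 * s ** 2 + gap ** 2)
    return best


def worst_ratio(ks, sigmas):
    """Largest upper-to-lower bound ratio over the given parameter grid."""
    return max(upper_bound(k, s0) / lower_bound(k, s0) for k in ks for s0 in sigmas)
```

Because the grid search can only overestimate the lower bound, the computed ratios are slightly optimistic. On small grids such as `worst_ratio([0.01, 0.1, 1, 10], [0.1, 1, 10, 100])` they should nevertheless fall well below the guaranteed constant.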

Remark: For the original counterexample, our results in [2]
provide a constant factor that is uniform over the problem
parameters $(k, \sigma_0^2)$. The constant factor of $\frac{200 a^2}{2^{2h(Z)}}$ here depends
on $a$ and $h(Z)$, and is therefore uniform over all $(k, \sigma_0^2)$
but only for a fixed noise distribution (and hence fixed a 
and h(Z)). It blows up when the noise distribution has a 
long tail, or has a hugely negative differential entropy. In 
such cases, greater care is required in the design of implicit 
communication strategies. For instance, if the distribution is 
long-tailed, the quantization points need not be separated by 
a, but they can instead be separated by a distance sufficiently 
large so that the probability of mistaking one quantization 
point for another at the second controller is low. This insight 
is used in [2] to obtain strategies for the Gaussian case. 



D. Quantization-based strategies outperform linear strategies 
by an unbounded factor 

Consider the scalar case. A linear constraint on the second
controller forces it to perform an LLSE estimation on the
output $Y_2$ in order to estimate $X_1$. The first controller, also
linear, uses an input $U_1 = \alpha X_0$. The resulting state $X_1 =
(1 + \alpha) X_0$ has variance $\sigma_1^2 = \sigma_0^2 (1 + \alpha)^2$. The mean-squared
estimation error is, therefore, $\frac{\sigma_1^2}{\sigma_1^2 + 1}$. Since this is an increasing
function of $\sigma_1^2$, the optimizing $\alpha$ is negative. Since $\alpha^2 \sigma_0^2 = P$,
$\alpha \sigma_0 = -\sqrt{P}$. The total cost for the optimal linear strategy is

$$J_{\mathrm{lin}} = \inf_{P \ge 0}\ k^2 P + \frac{(\sigma_0 - \sqrt{P})^2}{(\sigma_0 - \sqrt{P})^2 + 1}. \qquad (10)$$

Clearly, this cost remains the same in the vector case as well.
We now consider two cases. If $P <$ [...], then [...],
where (a) follows from the fact that $P <$ [...]. In the limit of
$\sigma_0^2 \to \infty$ and $k \to 0$, this lower bound increases to 1, whereas
the quantization upper bound of $k^2 a^2$ decreases to zero.
Alternatively, if $P >$ [...],

$$J_{\mathrm{lin}} > k^2 P > [...]$$

Thus, the ratio of the costs attained by the optimal linear
strategy and those attained by the quantization upper bound
is larger than [...], which diverges to infinity as
$k \to 0$, $\sigma_0^2 \to \infty$.

IV. Adversarial model for noise and state

Theorem 4: The optimal cost $J_{\mathrm{opt,freq}}$ for adversarially
modeled initial state and (bounded) noise $Z \in (-\sqrt{3}, \sqrt{3})$
with quadratic costs (as defined in Section II-B) is bounded
as follows

$$\frac{1}{2\pi e} \min\{3k^2, 3\} \le J_{\mathrm{opt,freq}} \le \min\{3k^2, 3\},$$

where the upper bound is achieved using quantization-based
strategies complemented by linear strategies. Further, in the
regime of $k \to 0$, the ratio of the costs attained by the best
linear strategy to that attained by appropriate quantization-based
nonlinear strategies diverges to infinity.

Proof: Upper bound: If $k^2 \le 1$, we use a uniform
quantization strategy with bin size $2\sqrt{3}$. Since the noise
amplitude is smaller than $\sqrt{3}$, there are no errors at the second
controller. The cost of this strategy is therefore $3k^2$, which is
attained in the event when the initial state is exactly at the
edge of one of the quantization bins.

If $k^2 > 1$, we use the zero-input strategy: the first
controller inputs zero, and the second controller chooses
$\mathbf{U}_2^m = \mathbf{Y}_2^m$ as the estimate of $\mathbf{X}_1^m$. Since noise amplitude
is bounded by $\sqrt{3}$, the normalized error for this strategy is
bounded by 3.

The upper bound is therefore given by $\min\{3k^2, 3\}$.

Lower bound: Even though the noise is chosen adversarially
(and deterministically), we first assume that the noise behaves
as a random variable with distribution $U(-\sqrt{3}, \sqrt{3})$ and the
initial state behaves as a Gaussian with variance $\sigma_0^2$ for some
$\sigma_0^2 > 0$. We assume that the adversary declares this strategy in
advance (which can only reduce the costs). From Theorem 2,
if the first controller chooses an average power $P$, then the
MMSE at the second controller is lower bounded by

$$\mathrm{MMSE} \ge \left( \left( \sqrt{\kappa(P)} - \sqrt{P} \right)^+ \right)^2.$$

Since this lower bound holds for all $\sigma_0^2$, we let $\sigma_0^2 \to \infty$, and
obtain the following bound,

$$\mathrm{MMSE} \ge \left( \left( \sqrt{\frac{2^{2h(Z)}}{2\pi e}} - \sqrt{P} \right)^+ \right)^2 \overset{(a)}{=} \left( \left( \sqrt{\frac{6}{\pi e}} - \sqrt{P} \right)^+ \right)^2,$$

where (a) follows from the fact that $h(Z) = \log_2(2\sqrt{3})$ for
$Z \sim U(-\sqrt{3}, \sqrt{3})$.

A lower bound on the average costs (averaged over the
initial state and noise realizations) for this problem is

$$J_{\mathrm{opt}} \ge \inf_{P \ge 0}\ k^2 P + \left( \left( \sqrt{\frac{6}{\pi e}} - \sqrt{P} \right)^+ \right)^2. \qquad (13)$$

At this point, if the adversary is allowed to use randomized
strategies, we already have a proof of the lower bound. But
what if it is required to play deterministically? We invoke an
argument inspired by the probabilistic method [19] to address
this requirement. For any fixed strategy $\gamma$, the lower bound
in (13) holds on the cost $J^{(\gamma)}(\mathbf{x}_0^m, \mathbf{z}^m)$ averaged over $\mathbf{x}_0^m$ and
$\mathbf{z}^m$. Thus, there exists a choice of
realizations $\mathbf{x}_0^{m(\gamma)}$ and $\mathbf{z}^{m(\gamma)}$ such that the cost $J^{(\gamma)}(\mathbf{x}_0^{m(\gamma)}, \mathbf{z}^{m(\gamma)})$
is at least as large as what the lower bound says it must be
on average if $\mathbf{x}_0^m$ and $\mathbf{z}^m$ were random. This cost is therefore
lower bounded by the expression in (13). This proves the lower
bound.



Bounded ratios: Case 1: $P^* \le \frac{6}{4\pi e}$.

In this case,

$$\mathrm{MMSE} \ge \left( \sqrt{\frac{6}{\pi e}} - \sqrt{\frac{6}{4\pi e}} \right)^2 = \frac{6}{4\pi e},$$

which is also a lower bound on the cost. Thus the ratio of the
zero-input upper bound (which is 3) and this lower bound is
at most $3 \times \frac{4\pi e}{6} = 2\pi e$.

Case 2: $P^* > \frac{6}{4\pi e}$.

In this case, the cost is no smaller than $k^2 P^* \ge k^2 \frac{6}{4\pi e}$. Thus
the ratio of quantization-based upper bound (which is $3k^2$)
and this lower bound is at most $3k^2 \times \frac{4\pi e}{6 k^2} = 2\pi e$.

The ratio of the upper and lower bound is therefore always
at most $2\pi e \approx 17.08$.
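
The adversarial bounds are simple enough to evaluate directly. The sketch below assumes the upper bound $\min\{3k^2, 3\}$ and the lower bound $\inf_P k^2 P + ((\sqrt{6/(\pi e)} - \sqrt{P})^+)^2$. The closed form $\frac{6}{\pi e} \cdot \frac{k^2}{1 + k^2}$ for this infimum is our own calculation (minimizing the quadratic in $\sqrt{P}$), and the code cross-checks it against a grid search. All function names are illustrative.

```python
import math

C_SQ = 6.0 / (math.pi * math.e)   # limit of kappa(P) as sigma0 grows, uniform noise


def adversarial_upper(k):
    """Quantization cost 3k^2 or zero-input cost 3, whichever is smaller."""
    return min(3.0 * k ** 2, 3.0)


def adversarial_lower(k):
    """Closed form of inf_s k^2 s^2 + (c - s)^2 with c^2 = C_SQ, attained at s = c/(1+k^2)."""
    return C_SQ * k ** 2 / (1.0 + k ** 2)


def adversarial_lower_grid(k, grid=20000):
    """Grid search over s = sqrt(P) in [0, c], used as a sanity check."""
    c = math.sqrt(C_SQ)
    best = float("inf")
    for i in range(grid + 1):
        s = c * i / grid
        best = min(best, k ** 2 * s ** 2 + (c - s) ** 2)
    return best


def ratio(k):
    return adversarial_upper(k) / adversarial_lower(k)
```

Under these assumptions the ratio stays below $2\pi e$ for every $k$, and for the values we tried it is noticeably smaller, which suggests the case analysis above is not tight.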

Nonlinear strategies can outperform linear by an arbitrary
factor: The costs attained by quantization-based strategies are
bounded by $3k^2$, regardless of the adversary's strategy. This
gives an upper bound on the cost of nonlinear strategies.
For linear strategies, we want to provide a lower bound. As
in the proof of constant factor optimality, assume that the
noise behaves as $U(-\sqrt{3}, \sqrt{3})$, and the initial state behaves
as $\mathcal{N}(0, \sigma_0^2)$. The average costs attained by any linear strategy
are lower bounded by

$$\inf_{P \ge 0}\ k^2 P + \frac{(\sigma_0 - \sqrt{P})^2}{(\sigma_0 - \sqrt{P})^2 + 1}. \qquad (14)$$

Again, using the probabilistic method, there exists a realization
of initial state and noise that attains a cost no
smaller than the average. This gives a lower bound of (14)
on deterministic costs. The bounds on costs of linear and
nonlinear strategies are the same as those in Section III-D with
the substitution of $a$ by $\sqrt{3}$. The remaining proof is thus the
same. ■
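
The divergence of the linear-to-quantization cost ratio can be illustrated numerically. The sketch below assumes unit-variance noise, a linear first stage $U_1 = \alpha X_0$ with $\sqrt{P} = -\alpha\sigma_0 \in [0, \sigma_0]$, LLSE estimation at the second controller, and the quantization cost bound $k^2 a^2$ with $a = \sqrt{3}$. The grid search and the parameter sequence $(k, \sigma_0) = (10^{-j}, 10^{j})$ are our own illustrative choices.

```python
import math


def best_linear_cost(k, sigma0, grid=20000):
    """Grid approximation of min over s = sqrt(P) of k^2 s^2 + v/(v+1), v = (sigma0 - s)^2."""
    best = float("inf")
    for i in range(grid + 1):
        s = sigma0 * i / grid              # s = sqrt(P), between 0 and sigma0
        v = (sigma0 - s) ** 2              # variance of the state after the linear input
        best = min(best, k ** 2 * s ** 2 + v / (v + 1.0))
    return best


def quantization_cost_bound(k, a=math.sqrt(3.0)):
    """Upper bound k^2 a^2 on the cost of the quantization strategy."""
    return k ** 2 * a ** 2


def linear_to_quantization_ratio(k, sigma0):
    return best_linear_cost(k, sigma0) / quantization_cost_bound(k)


# Ratios along a sequence with k -> 0 and sigma0 -> infinity.
ratios = [linear_to_quantization_ratio(10.0 ** -j, 10.0 ** j) for j in range(3)]
```

Along this sequence the best linear cost stays close to 1 while the quantization bound shrinks like $3k^2$, so the ratio grows without bound.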
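Appendix I

An upper bound on the mutual information across
the $X_1 - Y_2$ channel

$$\begin{aligned}
I(\mathbf{X}_1^m; \mathbf{Y}_2^m) &= h(\mathbf{Y}_2^m) - h(\mathbf{Y}_2^m | \mathbf{X}_1^m) \\
&\le \sum_{i=1}^{m} h(Y_{2,i}) - h(\mathbf{Y}_2^m | \mathbf{X}_1^m) \\
&= \sum_{i=1}^{m} h(Y_{2,i}) - \sum_{i=1}^{m} h(Y_{2,i} | X_{1,i}) \\
&= \sum_{i=1}^{m} I(X_{1,i}; Y_{2,i}) \\
&\overset{(a)}{=} m I(X_1; Y_2 | Q) = m \left( h(Y_2 | Q) - h(Y_2 | X_1, Q) \right) \\
&= m \left( h(Y_2 | Q) - h(Y_2 | X_1) \right) \\
&\le m \left( h(Y_2) - h(Y_2 | X_1) \right) = m I(X_1; Y_2),
\end{aligned}$$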



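where in (a), $X_1 = X_{1,i}$ if $Q = i$ (and $Y_2$ is defined
similarly), and $Q$ is distributed uniformly on the discrete set
$\{1, 2, \ldots, m\}$. Because $Z$ is independent of $X_0$ and $U_1$, the
variance of $Y_2 = X_0 + U_1 + Z$ is maximized when $X_0$ (of
power $\sigma_0^2$) and $U_1$ (of power $P$) are aligned, and it equals
$(\sigma_0 + \sqrt{P})^2 + 1$. Thus,

$$\begin{aligned}
I(X_1; Y_2) &= h(Y_2) - h(Y_2 | X_1) \\
&= h(Y_2) - h(Z) \\
&\overset{(a)}{\le} \frac{1}{2} \log_2\left( 2\pi e \left( (\sigma_0 + \sqrt{P})^2 + 1 \right) \right) - h(Z) \\
&= \frac{1}{2} \log_2\left( \frac{2\pi e \left( (\sigma_0 + \sqrt{P})^2 + 1 \right)}{2^{2h(Z)}} \right), \qquad (15)
\end{aligned}$$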



where (a) follows from the observation that for given second 
moment of the random variable, the distribution that maxi- 
mizes the differential entropy is Gaussian. 